renv::restore()
- The library is already synchronized with the lockfile.
The Olink High-Throughput (HT) proteomic platform combines the specificity of the Proximity Extension Assay (PEA) with a Next-Generation Sequencing (NGS) readout.
More generally, when the Plate Controls in a dataset differ from the reference Plate Control lot used by the analysis pipeline, an internal Plate Control Lot Factor can be applied to the Plate Control ExtNPX values to align them to that reference.
\[ExtNPX_{i, PC} (\text{adjusted}) = ExtNPX_{i, PC} (\text{raw}) + \text{PC Lot Factor}_i\]
Substituting the adjusted Plate Control values into the median gives the effective PC normalization formula:
\[NPX_{i,j} = ExtNPX_{i,j} - \text{median}(ExtNPX_{i, PC} (\text{raw})) - \text{PC Lot Factor}_i\]
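The two formulas above can be checked numerically. The following is a minimal sketch on made-up values for a single assay on a single plate; the data, the column layout, and the lot factor of 0.3 are all assumptions for illustration, not values from any Olink pipeline. It shows that shifting the Plate Controls before taking the median gives the same NPX as subtracting the lot factor afterwards.

```r
library(dplyr)

# Toy ExtNPX values for one assay on one plate (all numbers are made up)
ext <- tibble::tribble(
  ~SampleID, ~SampleType,     ~ExtNPX,
  "S1",      "SAMPLE",        2.0,
  "S2",      "SAMPLE",        3.5,
  "PC1",     "PLATE_CONTROL", 1.0,
  "PC2",     "PLATE_CONTROL", 1.2,
  "PC3",     "PLATE_CONTROL", 1.4
)

pc_lot_factor <- 0.3  # hypothetical lot factor for this assay

# Median of the raw (unadjusted) Plate Control ExtNPX values
pc_median_raw <- ext |>
  filter(SampleType == "PLATE_CONTROL") |>
  pull(ExtNPX) |>
  median()

npx <- ext |>
  mutate(
    # Route A: adjust the Plate Controls first, then subtract their median
    NPX_adjust_first = ExtNPX -
      median(ExtNPX[SampleType == "PLATE_CONTROL"] + pc_lot_factor),
    # Route B: the combined formula with the raw median and the lot factor
    NPX_combined = ExtNPX - pc_median_raw - pc_lot_factor
  )

# Both routes agree because adding a constant shifts the median by that constant
all.equal(npx$NPX_adjust_first, npx$NPX_combined)
```

In real data the same calculation would be repeated for every assay and plate, with one lot factor per assay.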
The platform relies on a sophisticated hierarchy of controls to ensure data quality (ExploreHT_QC.pdf).
There is a documented discrepancy between manufacturer-reported reliability and independent study results.
Official metrics report high precision (ExploreHT_Validation.pdf):
* IntraCV (within-plate): median ~11.2%.
* InterCV (between-plates): median ~8.7%.
| Block | # of assays | Dilution factor | Intra-assay %CV mean | Inter-assay %CV mean |
|---|---|---|---|---|
| 1 | 742 | 1:1 | 23.3 | 20.7 |
| 2 | 1314 | 1:1 | 13.3 | 11.8 |
| 3 | 1204 | 1:1 | 9.8 | 7.1 |
| 4 | 1106 | 1:1 | 7.2 | 3.5 |
| 5 | 582 | 1:10 | 6.6 | 3.8 |
| 6 | 270 | 1:100 | 5.6 | 5.3 |
| 7 | 134 | 1:1000 | 11.0 | 6.2 |
| 8 | 68 | 1:100,000 | 8.6 | 12.4 |
Olink validation assesses CVs on a subset of 291 selected assays (~5%). The subset consists of proteins that are typically well expressed in healthy plasma, which makes it possible to calculate reliable CV values.
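The CV figures above are percentages on the linear scale, whereas NPX is a log2 quantity, so a conversion is needed when computing CVs from replicate NPX values. The sketch below uses the standard log-normal relationship between the SD on the log scale and the linear-scale CV. Whether the validation figures in the table were computed with exactly this formula is not stated here, so the function name `cv_from_log2` and the replicate values are illustrative assumptions.

```r
# Log-normal %CV for values measured on a log2 scale (assumed convention)
cv_from_log2 <- function(npx) {
  s <- sd(npx, na.rm = TRUE) * log(2)  # SD converted to natural-log units
  100 * sqrt(exp(s^2) - 1)
}

# Made-up replicate NPX values for one assay on one plate
replicates <- c(5.10, 5.25, 4.95, 5.05)

cv_replicates <- cv_from_log2(replicates)
cv_replicates
```

For small SDs the result is close to `100 * log(2) * sd(npx)`, which is a quick way to sanity-check an individual value.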
The LOD is the threshold where the protein signal is statistically distinguishable from the Negative Control background.
Reliability is strongly tied to the signal-to-noise ratio (Rooney2025_ARIC.pdf):
* Assay CV is inversely correlated with the percentage of samples above LOD (\(r = -0.77\)), so precision improves as detectability rises.
* Assays where \(NPX < LOD\) are dominated by technical noise, leading to artificially inflated CVs.
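The percentage of samples above LOD can be summarized per assay as sketched below. This is a toy example under stated assumptions: the data are invented, and the `LOD` column is hypothetical (the delivered file described in this document does not list one, so it would have to be joined in from a separate source first).

```r
library(dplyr)

# Made-up long-format data; the LOD column is an assumption for illustration
toy <- tibble::tribble(
  ~Assay, ~SampleID, ~NPX, ~LOD,
  "A",    "S1",      2.1,  1.0,
  "A",    "S2",      1.8,  1.0,
  "A",    "S3",      0.4,  1.0,
  "A",    "S4",      3.0,  1.0,
  "B",    "S1",      0.2,  0.5,
  "B",    "S2",      0.1,  0.5,
  "B",    "S3",      0.9,  0.5,
  "B",    "S4",      0.3,  0.5
)

# Share of samples whose NPX exceeds the assay's LOD
detectability <- toy |>
  group_by(Assay) |>
  summarise(pct_above_lod = 100 * mean(NPX > LOD), .groups = "drop")

detectability
```

Assays with a low `pct_above_lod` are the ones where CV estimates should be read with caution.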
| Column | Description | Type | Typical value |
|---|---|---|---|
| SampleID | The annotated sample ID | String | |
| SampleType | Type of sample | String | PLATE_CONTROL, NEGATIVE_CONTROL, SAMPLE_CONTROL, SAMPLE |
| WellID | ID for well | String | Capital letter A–H followed by number 1–12 |
| PlateID | Name of the plate the sample was run on | String | |
| DataAnalysisRefID | Reference ID for data analysis | String | |
| OlinkID | OlinkID for assay | String | |
| UniProt | UniProt ID for assay | String | |
| Assay | Gene name for assay | String | |
| AssayType | Type of assay | String | assay, amp_ctrl, inc_ctrl, ext_ctrl |
| Panel | Panel name | String | Explore_HT |
| Block | Name of the block the sample was run on | String | 1, 2, 3, 4, 5, 6, 7, or 8 |
| Count | The total number of counts | Integer | Greater than or equal to 1 |
| ExtNPX | Intermediate value between count and NPX: log2 of the ratio between data-point Count value and the count for the Extension Control assay for the same sample. | Double | -1.94701 |
| NPX | NPX value | Double | |
| Normalization | Type of normalization used in project | String | Plate control, Intensity or EXCLUDED |
| PCNormalizedNPX | NPX value displayed if plate control normalization has been chosen. | Double | 1.735509 |
| AssayQC | Overall QC status for an assay | String | NA, PASS, WARN |
| SampleQC | Overall QC status for a sample in a block | String | NA, PASS, WARN, FAIL |
| ExploreVersion | Software version of the module in NPX Explore HT & 3072 | String | |
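The ExtNPX definition in the table can be illustrated with a small worked example. This is a sketch on invented counts, assuming one Extension Control per sample; in real data the calculation would presumably be done within each sample and block. It also shows why the Extension Control rows themselves have an ExtNPX of 0.

```r
library(dplyr)

# Made-up counts for two samples; assay names P1 and P2 are placeholders
counts <- tibble::tribble(
  ~SampleID, ~AssayType, ~Assay,                ~Count,
  "S1",      "ext_ctrl", "Extension control 1", 16000,
  "S1",      "assay",    "P1",                  4000,
  "S1",      "assay",    "P2",                  64000,
  "S2",      "ext_ctrl", "Extension control 1", 2000,
  "S2",      "assay",    "P1",                  1000,
  "S2",      "assay",    "P2",                  2000
)

# ExtNPX = log2(Count / Extension Control count of the same sample)
ext_npx <- counts |>
  group_by(SampleID) |>
  mutate(ExtNPX = log2(Count / Count[AssayType == "ext_ctrl"])) |>
  ungroup()

ext_npx
```

Dividing by the Extension Control count removes sample-level differences in total signal: S2 has far fewer counts than S1 overall, yet its ExtNPX values are on a comparable scale.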
Olink provides normalized Parquet outputs upon request or based on experimental design (e.g., whether samples were randomized). Intensity normalization is generally recommended as the primary method. When it is used, the standard NPX column contains Intensity-normalized values, while PC-normalized values remain accessible via the dedicated PCNormalizedNPX column.
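Because both normalizations travel in the same file, it helps to confirm which one the NPX column carries before analysis. The following is a minimal sketch on invented rows with placeholder OlinkIDs; keeping the delivered values in a new column named `NPX_intensity` is one possible convention, not something prescribed by Olink.

```r
library(dplyr)

# Made-up rows mimicking the relevant columns of the delivered file
toy <- tibble::tribble(
  ~SampleID, ~OlinkID,   ~NPX,  ~PCNormalizedNPX, ~Normalization,
  "S1",      "OID00001", 0.40,  1.70,             "Intensity",
  "S2",      "OID00001", -0.10, 1.20,             "Intensity",
  "S1",      "OID00002", NA,    NA,               "EXCLUDED"
)

# Which normalization does the NPX column carry?
normalization_counts <- count(toy, Normalization)

# Switch downstream analysis to PC-normalized values while keeping
# the delivered Intensity-normalized values in their own column
toy_pc <- toy |>
  filter(Normalization != "EXCLUDED") |>
  mutate(NPX_intensity = NPX, NPX = PCNormalizedNPX)

toy_pc
```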
library(tidyverse)
library(magrittr)
library(OlinkAnalyze)
library(knitr)
library(kableExtra)
source("R/olink_helpers.R")
rp2_olink_fn <- "/Volumes/DCEG/CGF/Laboratory/Projects/MR-0084/RP0084-045/Data/NPXMap Exports/RP0084-045_Extended_NPX_2025-10-27.parquet"
d1_olink_fn <- "/Volumes/DCEG/CGF/Laboratory/Projects/DESL Aliquoting Projects/NAS_CS036024/Olink_DataDelivery/Q-13387_Hutchinson_Extended_NPX_2024-06-26.parquet"
fixedLOD_fn <- "/Volumes/DCEG/CGF/TechTransfer/Proteomics/Olink Explore/Resources/Analysis/Olink Analyze/Explore HT_Fixed LOD.csv"
rp2_npx <- read_NPX(rp2_olink_fn)
ℹ This parquet file is for research use only: "For Research Use Only. Not for use in diagnostic procedures."
! Multiple quantification columns detected (NPX, PCNormalizedNPX, Count). NPX will be used for downstream analysis.
dim(rp2_npx)
[1] 783360     30
rp2_npx %>%
  head() %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  scroll_box(width = "100%")

| SampleID | SampleType | WellID | PlateID | DataAnalysisRefID | OlinkID | UniProt | Assay | AssayType | Panel | Block | Count | ExtNPX | NPX | Normalization | PCNormalizedNPX | AssayQC | SampleQC | SoftwareVersion | SoftwareName | PanelDataArchiveVersion | PreProcessingVersion | PreProcessingSoftware | InstrumentType | IntraCV | InterCV | SampleBlockQCWarn | SampleBlockQCFail | BlockQCFail | AssayQCWarn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IO_3862_1158_1 | SAMPLE | A7 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 17461 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 1 | 1 | 1 | 0 |
| IR_8487_1158_1 | SAMPLE | A8 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 16613 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 1 | 1 | 1 | 0 |
| KF_0956_1158_1 | SAMPLE | A9 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 2220 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 1 | 1 | 1 | 0 |
| UM_0172_1035_1 | SAMPLE | A10 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 14407 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 1 | 1 | 1 | 0 |
| IA_7060_1158_1 | SAMPLE | A11 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 1231 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 1 | 1 | 1 | 0 |
| Sample_12 | SAMPLE_CONTROL | A12 | RP0084-045_A | D10007 | OID45511 | EXT1 | Extension control 1 | ext_ctrl | Explore_HT | 1 | 17385 | 0 | 0 | Intensity | 0 | NA | PASS | 1.3.0 | NPX Map | 1.3.0 | 5.1.0 | ngs2counts | Illumina NovaSeq X Plus | NaN | NaN | 0 | 1 | 1 | 0 |
# derive sample information from the parquet data frame
rp2_sam <- rp2_npx %>%
  select(SampleID, SampleType, WellID, PlateID) %>%
  unique() %>%
  rename(plate = PlateID, well = WellID) %>%
  mutate(column = paste0("Column ", substr(well, 2, 3)),
         row = substr(well, 1, 1))
rp2_sam

There are 144 samples in project rp2 and 5440 assays (including internal controls) for each sample, which accounts for every row of the data frame: 5440 × 144 = 783360.
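The row-count arithmetic above can be turned into an explicit completeness check. This is a minimal sketch on a made-up table with placeholder IDs; it assumes SampleID is unique within the project, which may not hold if the same control name is reused across plates.

```r
library(dplyr)

# Toy long table: every sample crossed with every assay (IDs are made up)
toy <- expand.grid(
  SampleID = c("S1", "S2", "S3"),
  OlinkID  = c("OID_a", "OID_b"),
  stringsAsFactors = FALSE
)

n_samples <- n_distinct(toy$SampleID)
n_assays  <- n_distinct(toy$OlinkID)

# TRUE only when no sample-assay pair is missing or duplicated
is_complete <- nrow(toy) == n_samples * n_assays &&
  nrow(distinct(toy, SampleID, OlinkID)) == nrow(toy)

# Rows per sample; a complete table gives the same count for every sample
rows_per_sample <- count(toy, SampleID, name = "n_rows")

is_complete
rows_per_sample
```

If `is_complete` comes back FALSE on real data, `rows_per_sample` points to the samples with missing or duplicated rows.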
OlinkAnalyze::olink_displayPlateLayout(data = rp2_sam, fill.color = "SampleType")

inspect_olink_qc

The OlinkAnalyze package contains an internal function, npxCheck(), which validates NPX data integrity and identifies cases where entire samples or assays consist of missing values (NA). However, it does not provide granular details regarding Sample-Block QC status, nor does it explicitly report on assays or samples flagged with a “WARN” status. To address this, we define the custom function inspect_olink_qc. This function extends standard validation by summarizing Olink QC flags across specific blocks, providing a clearer overview of data quality for downstream analysis.
inspect_olink_qc <- function(dat) {
  # 1. Identify assays that are not "PASS" (excluding controls)
  #    Filtering for AssayType == "assay" ensures we focus on actual targets
  flagged_assays <- dat |>
    filter(AssayType == "assay") |>
    select(OlinkID, Assay, Block, AssayQC) |>
    distinct() |>
    filter(AssayQC != "PASS")

  # 2. Identify unique SampleIDs that are not "PASS" or "NA"
  #    Based on Olink definitions, "NA" usually refers to excluded assays, not failed samples
  flagged_sample_ids <- dat |>
    filter(!SampleQC %in% c("NA", "PASS")) |>
    pull(SampleID) |>
    unique()

  # 3. Create a block-wise matrix of SampleQC status for flagged samples
  #    This helps visualize if a sample failed across all blocks or just one
  flagged_samples_matrix <- dat |>
    filter(SampleID %in% flagged_sample_ids) |>
    group_by(SampleID, Block, SampleQC) |>
    summarise(N = n(), .groups = "drop") |>
    mutate(status_label = paste0(SampleQC, " (n=", N, ")")) |>
    group_by(SampleID, Block) |>
    summarise(QC = stringr::str_flatten(status_label, collapse = ", "), .groups = "drop") |>
    tidyr::pivot_wider(names_from = Block, values_from = QC)

  # Return results as a named list
  list(
    flagged_assays = flagged_assays,
    flagged_sample_ids = flagged_sample_ids,
    flagged_samples_matrix = flagged_samples_matrix
  )
}

filter(AssayType == "assay") is applied to exclude all internal controls.
qc_results <- inspect_olink_qc(rp2_npx)
qc_results$flagged_assays
qc_results$flagged_samples_matrix